PubRunner: A light-weight framework for updating text mining results
Abstract
Biomedical text mining promises to help biologists quickly navigate the combined knowledge in their domain, improving understanding of the complex interactions within biological systems and speeding up hypothesis generation. New biomedical research articles are published daily, and text mining tools are only as good as the corpus they work from. Many text mining tools are underused because their results are static and do not reflect the constantly expanding knowledge in the field. For biomedical text mining to become an indispensable tool for researchers, this problem must be addressed. To this end, we present PubRunner, a framework for regularly running text mining tools on the latest publications. PubRunner is lightweight, simple to use, and can be integrated with an existing text mining tool. The workflow involves downloading the latest abstracts from PubMed, executing a user-defined tool, pushing the resulting data to a public FTP server or Zenodo dataset, and publicizing the location of these results on the public PubRunner website. We illustrate the use of this tool by re-running the commonly used word2vec tool on the latest PubMed abstracts to generate up-to-date word vector representations for the biomedical domain. We hope this proof of concept will encourage text mining developers to build tools that truly aid biologists in exploring the latest publications. This article is included in the Container Virtualization in Bioinformatics collection.
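The workflow described in the abstract (fetch corpus, run a user-defined tool, publish the results) can be illustrated with a minimal sketch. This is not PubRunner's actual code or API: the names `run_pipeline`, `PipelineResult`, `fake_fetch`, `fake_publish`, and the `{IN}`/`{OUT}` placeholders are all hypothetical. The PubMed download and the FTP/Zenodo upload are replaced by local stand-ins so the example runs offline.

```python
import subprocess
import sys
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List


@dataclass
class PipelineResult:
    corpus_files: List[Path]   # files produced by the fetch step
    output_dir: Path           # where the tool wrote its results
    published_url: str         # location reported by the publish step


def run_pipeline(
    fetch_corpus: Callable[[Path], List[Path]],
    tool_command: List[str],
    publish: Callable[[Path], str],
    work_dir: Path,
) -> PipelineResult:
    """Fetch a corpus, run a user-defined tool on it, then publish the output."""
    corpus_dir = work_dir / "corpus"
    output_dir = work_dir / "output"
    corpus_dir.mkdir(parents=True, exist_ok=True)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: obtain the latest documents (a real fetcher would contact PubMed).
    corpus_files = fetch_corpus(corpus_dir)

    # Step 2: run the user's tool, substituting the hypothetical placeholders.
    command = [
        arg.replace("{IN}", str(corpus_dir)).replace("{OUT}", str(output_dir))
        for arg in tool_command
    ]
    subprocess.run(command, check=True)

    # Step 3: push the results somewhere public and record where they went.
    published_url = publish(output_dir)
    return PipelineResult(corpus_files, output_dir, published_url)


# Stand-in "tool": counts whitespace-separated words in every .txt input file.
COUNT_WORDS = (
    "import sys, pathlib\n"
    "src, dst = pathlib.Path(sys.argv[1]), pathlib.Path(sys.argv[2])\n"
    "n = sum(len(p.read_text().split()) for p in src.glob('*.txt'))\n"
    "(dst / 'word_count.txt').write_text(str(n))\n"
)


def fake_fetch(corpus_dir: Path) -> List[Path]:
    """Assumption: writes one local file instead of downloading abstracts."""
    path = corpus_dir / "abstracts.txt"
    path.write_text("text mining tools need fresh corpora")
    return [path]


def fake_publish(output_dir: Path) -> str:
    """Assumption: returns a file:// URI instead of uploading anywhere."""
    return output_dir.resolve().as_uri()


def demo(work_dir: Path) -> PipelineResult:
    tool = [sys.executable, "-c", COUNT_WORDS, "{IN}", "{OUT}"]
    return run_pipeline(fake_fetch, tool, fake_publish, work_dir)
```

Passing the fetch and publish steps in as functions keeps the tool invocation independent of where the corpus comes from and where the results go. That separation is the one the abstract implies when it says PubRunner can wrap an existing text mining tool.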
Publication date: 2017